-
Notifications
You must be signed in to change notification settings - Fork 629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Config parser (and loader) #4744
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
✅ Deploy Preview for nextflow-docs-staging ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Research updates:
|
Here's an example to illustrate how the config parser "executes" the config file. Given this config: foo.bar = 'hello'
foo {
bar = 'hello'
}
includeConfig 'foobar.config' The parser transforms this code into the following: assign(['foo', 'bar'], 'hello')
block('foo', {
assign(['bar'], 'hello')
}
includeConfig('foobar.config') Basically each statement is translated to a DSL method (see I think this is a good model for how we might separate the Nextflow language from the runtime. Instead of hooking directly into Groovy methods and relying on Groovy MOP, metaclass, etc, we can introduce a hidden DSL which allows us to better control how a given statement (e.g. invoking a process in a workflow) is executed by the runtime. |
Java security manager is being removed. I will keep an eye out for other sandboxing techniques, but in the meantime, a simple solution is to provide a "safe" execution mode where assignment expressions are wrapped in string literals. |
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Really like the way that this is going, great work 👏🏻 |
@bentsherman can you please solve the conflicts? |
The PR I wanted you to review is this one -- #4201 However I just looked at it and it's not really in a review-able state. I'll ping you again when it's more complete. |
Close #2723
This PR explores one possible solution to the config overhaul by implementing a custom parser for config files, using a similar approach as #4613.
The linked issue initially proposed adopting a new config format. However, most of the proposed alternatives do not have all of the capabilities that we need out of the box or have their own downsides. I think the actual problem is that the config file has been stretched to cover use cases for which it wasn't originally designed.
Between this PR, the module config (#4510), and moving the params definition from the config file to the params schema, I think this will relieve enough pressure from the config file that we can keep it.
This custom parser enforces a much simpler syntax:
foo.bar.baz ...
).As a result, the config parser is much simpler and the code is easier to read. There is no more dependence on metaclasses, instead there is a simple builder DSL for executing a config file and producing a map.
The vast majority of existing configuration is still supported, with only a few minor changes needed:
Include statements are sometimes wrapped in a try-catch to prevent a missing config file from failing the pipeline. For this we can either provide both
includeConfig
andrequireConfig
, or demote the error to a warning when the config file is remoteCustom functions like
check_max
are no longer supported. Instead, I suggest that we import thelib
directory into config files as we do for scripts, so that these helper functions can simply be defined thereConditional statements are no longer supported. With module config they should no longer be needed
Params should be defined only in
nextflow_schema.json
to prevent issues like Allow custom configsparams
to be parsed before nextflow.config #2662 where a param is referenced before params are fully resolved. More of a best practice than a hard requirement. We can still support params defined in the config (likely with a warning), and overriding param default values in a profile is still a valid use case.All of these issues can be resolved independently of this PR, in case we need to smooth the migration process.
It is still possible to execute arbitrary code with side effects, just not as easily as before. We can't really avoid this if we want to support dynamic expressions:
TODO:
lib
directory for using helper functions in config files